maxout unit
Appendix
The appendix is organized as follows: Appendix A contains proofs related to activation patterns and activation regions; Appendix B, proofs related to the number of regions attained with positive probability; Appendix D, proofs related to the expected volume of activation regions; and Appendix E, proofs related to the expected number of activation regions.
On the Number of Linear Regions of Deep Neural Networks
Guido F. Montufar, Razvan Pascanu, Kyunghyun Cho, Yoshua Bengio
We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We improve complexity bounds from pre-existing work and investigate the behavior of units in higher layers.
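To make the depth advantage concrete, here is the flavor of the paper's main rectifier bound, recalled from memory rather than quoted (verify the exact hypotheses in the source): a rectifier network with $n_0$ inputs and $L$ hidden layers of width $n \ge n_0$ can realize a number of linear regions growing exponentially in depth but only polynomially in width.

```latex
% Lower bound on the maximal number of linear regions of a rectifier network
% with n_0 inputs and L hidden layers of width n >= n_0 (recalled from the
% paper's main theorem; check the source for the precise statement):
\[
  \#\,\text{regions} \;\ge\;
  \left\lfloor \frac{n}{n_0} \right\rfloor^{(L-1)\,n_0}
  \sum_{j=0}^{n_0} \binom{n}{j}.
\]
% Exponential in the depth L, polynomial in the width n: the precise sense
% in which deep models "re-use pieces of computation" across layers.
```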
Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums
Guido Montúfar, Yue Ren, Leon Zhang
We present results on the number of linear regions of the functions that can be represented by artificial feedforward neural networks with maxout units. A rank-$k$ maxout unit is a function computing the maximum of $k$ linear functions. For networks with a single layer of maxout units, the linear regions correspond to the upper vertices of a Minkowski sum of polytopes. We obtain face counting formulas in terms of the intersection posets of tropical hypersurfaces or the number of upper faces of partial Minkowski sums, along with explicit sharp upper bounds for the number of regions for any input dimension, any number of units, and any ranks, in the cases with and without biases. Based on these results, we also obtain asymptotically sharp upper bounds for networks with multiple layers.
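As a hedged, brute-force illustration of what "linear regions of a maxout layer" means (our own sketch, not the paper's Minkowski-sum or tropical machinery; the sizes and the grid window below are arbitrary choices):

```python
# A rank-k maxout unit computes max_j (w_j . x + b_j). For a single maxout
# layer, the linear regions are the cells on which every unit's argmax is
# constant; here we estimate their number on a 2D grid of inputs.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_units, rank = 2, 3, 4          # input dim, units, rank k (arbitrary)
W = rng.normal(size=(n_units, rank, n_in))
b = rng.normal(size=(n_units, rank))

# Evaluate the argmax pattern of every unit on a grid over [-2, 2]^2.
xs = np.linspace(-2, 2, 400)
X = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, n_in)   # (N, 2)
pre = np.einsum('ukd,nd->nuk', W, X) + b                       # (N, units, rank)
pattern = pre.argmax(axis=-1)                                  # (N, units)

# Each distinct tuple of argmax indices is (generically) one linear region
# intersected with the window; regions outside the window are not counted.
n_regions = len({tuple(p) for p in pattern})
print(f"distinct argmax patterns on the grid: {n_regions}")
```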
On the Expected Complexity of Maxout Networks
Hanna Tseran, Guido Montúfar
Learning with neural networks relies on the complexity of the representable functions, but more importantly, the particular assignment of typical parameters to functions of different complexity. Taking the number of activation regions as a complexity measure, recent works have shown that the practical complexity of deep ReLU networks is often far from the theoretical maximum. In this work we show that this phenomenon also occurs in networks with maxout (multi-argument) activation functions and when considering the decision boundaries in classification tasks. We also show that the parameter space has a multitude of full-dimensional regions with widely different complexity, and obtain nontrivial lower bounds on the expected complexity. Finally, we investigate different parameter initialization procedures and show that they can increase the speed of convergence in training.
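A minimal empirical probe in the spirit of the abstract's complexity measure (a sketch under our own assumptions: the initialization scale, widths, rank, and line segment are arbitrary, and counting distinct argmax patterns along a line only lower-bounds the region count there):

```python
# Count distinct activation patterns of a randomly initialized maxout
# network along a 1D line through input space.
import numpy as np

rng = np.random.default_rng(1)

def maxout_patterns_on_line(widths, rank=3, n_points=10_000):
    """widths = [n_in, h1, ...]; count distinct argmax patterns on a line."""
    Ws = [rng.normal(0, 1 / np.sqrt(m), size=(n, rank, m))
          for m, n in zip(widths[:-1], widths[1:])]
    bs = [np.zeros((n, rank)) for n in widths[1:]]

    # Points on a random line segment through the origin.
    direction = rng.normal(size=widths[0])
    X = np.linspace(-3, 3, n_points)[:, None] * direction[None, :]

    H, bits = X, []
    for W, b in zip(Ws, bs):
        Z = np.einsum('nkm,Nm->Nnk', W, H) + b   # (points, units, rank)
        bits.append(Z.argmax(-1))                # active piece per unit
        H = Z.max(-1)                            # maxout forward pass
    full = np.concatenate(bits, axis=1)
    return len({tuple(row) for row in full})

print(maxout_patterns_on_line([2, 8, 8]))  # typically far below the maximum
```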
Conditional Computation for Continual Learning
Min Lin, Jie Fu, Yoshua Bengio
Catastrophic forgetting in connectionist neural networks is caused by the global sharing of parameters among all training examples. In this study, we analyze parameter sharing under the conditional computation framework, where the parameters of a neural network are conditioned on each input example. At one extreme, if each input example uses a disjoint set of parameters, there is no sharing of parameters and thus no catastrophic forgetting. At the other extreme, if the parameters are the same for every example, the model reduces to a conventional neural network. We then introduce a clipped version of maxout networks that lies in the middle, i.e., parameters are shared partially among examples. Based on the parameter sharing analysis, we can locate a limited set of examples that are interfered with when learning a new example. We propose to perform rehearsal on this set to prevent forgetting, which we term conditional rehearsal. Finally, we demonstrate the effectiveness of the proposed method in an online non-stationary setup, where updates are made after each new example and the distribution of received examples shifts over time.
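A toy illustration of the partial-parameter-sharing idea (our own sketch, not the paper's clipped maxout): in a maxout unit, only the winning linear piece receives gradient for a given example, so examples with different winners update disjoint parameter subsets.

```python
# In a maxout unit max_j(W[j] @ x + b[j]), a gradient step at x touches only
# the parameters of the winning piece j, confining interference between
# examples to those that share a winner.
import numpy as np

rng = np.random.default_rng(2)
k, d = 4, 3                              # rank and input dim (arbitrary)
W, b = rng.normal(size=(k, d)), rng.normal(size=k)

def active_piece(x):
    """Index of the piece whose parameters a gradient step at x would touch."""
    return int(np.argmax(W @ x + b))

x1, x2 = rng.normal(size=d), rng.normal(size=d)
print("active piece for x1:", active_piece(x1))
print("active piece for x2:", active_piece(x2))
# When the winners differ, training on x2 leaves x1's piece untouched; the
# examples that *share* x1's winner are the ones a rehearsal set must cover.
```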
A Tropical Approach to Neural Networks with Piecewise Linear Activations
Charisopoulos, Vasileios, Maragos, Petros
Traditional literature on pattern recognition and neural networks uses the linear Perceptron, a multiply-accumulate architecture fed into an (optional) activation function, introduced by Rosenblatt [40], as the building block of a multitude of complex architectures modelling neural computation. In recent years, multilayered, complex neural network architectures have enjoyed unprecedented growth in popularity with the introduction of the deep learning paradigm [4]. An illustrative example of the power of deep learning is the Convolutional Neural Network; although CNNs were the state of the art when they were introduced two decades ago [24], it was not until recently that they were systematically applied to image recognition challenges [23], achieving results comparable to humans.